A large vocabulary continuous speech recognition hybrid system for the Portuguese language

Authors

  • João Paulo da Silva Neto
  • Ciro Martins
  • Luís B. Almeida
Abstract

Owing to the enormous progress in large vocabulary, speaker-independent continuous speech recognition systems, which has occurred essentially for US English, there is a large demand for systems of this kind for other languages. In this paper we present the work done in the development of a large vocabulary, speaker-independent continuous speech recognition hybrid system for the European Portuguese language. This is a difficult task because this technology is still at an early stage of development for European Portuguese. The development of such a system for a new language depends on the availability of the appropriate source components, mainly a speech corpus and large amounts of text. This work became possible through the development of a new database (BD-PUBLICO), a large vocabulary speech corpus for the European Portuguese language that we developed over the last two years.

Similar resources

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive is one of the key issues for this broadcasting...

The development of a speaker independent continuous speech recognizer for Portuguese

The development and evaluation of large vocabulary, speaker-independent continuous speech recognition systems are mainly done for the American English language. In this paper we present the work done to date in the development of a hybrid large vocabulary, speaker-independent continuous speech recognition system for the European Portuguese language. Due to the lack of a large appropriate speec...

The use of syllable segmentation information in continuous speech recognition hybrid systems applied to the Portuguese language

Recent work has shown that the use of syllables as the basic unit in a speech recognition system could be very useful. This work introduced methods exploiting syllable information as a means to add robustness in "traditional" systems that use phonemes/phones as the basic unit. Since Portuguese is a highly syllabic language, we expected that information from syllables would introduce potenti...

THE DEVELOPMENT OF A SPEAKER INDEPENDENT CONTINUOUS SPEECH RECOGNIZER FOR PORTUGUESE

The development and evaluation of large vocabulary, speaker-independent continuous speech recognition systems are mainly done for the American English language. In this paper we present the work done to date in the development of a hybrid large vocabulary, speaker-independent continuous speech recognition system for the European Portuguese language. Due to the lack of a large appropriate spe...

Speech Recognition of Broadcast News for the European Portuguese Language

This paper describes our work on the development of a large vocabulary continuous speech recognition system applied to a Broadcast News task for the European Portuguese language in the scope of the ALERT project. We start by presenting the baseline recogniser AUDIMUS, which was originally developed with a corpus of read newspaper text. This is a hybrid system that uses a combination of phone pr...

Publication date: 1998